
Conversation

@zulissimeta (Contributor) commented Jan 27, 2026

Summary of Changes

This PR adds support for FAIRChem batched inference via the InferenceBatcher from fairchem.core.calculate, enabling efficient GPU batching when running the same calculation (e.g., relaxations or static calculations) on many structures concurrently.

Key additions:

  1. New map_partition_fairchembatch function in src/quacc/wflow_tools/job_patterns.py:

    • A @job that processes a single partition (flat list) of atoms using FAIRChem batched inference
    • Submits calculations concurrently via threads while Ray Serve batches GPU requests
  2. New map_partitioned_lists_fairchembatch function in src/quacc/wflow_tools/job_patterns.py:

    • Dispatches partitioned data (list of lists) to map_partition_fairchembatch jobs
    • Mirrors the existing map_partitioned_lists / map_partition pattern
  3. New get_inference_batcher function in src/quacc/recipes/mlp/_base.py:

    • Creates and caches InferenceBatcher instances keyed by model configuration
    • Includes shutdown_inference_batchers() for cleanup
  4. Updated pick_calculator in src/quacc/recipes/mlp/_base.py:

    • Now accepts a predict_unit kwarg to use batched inference
    • Bypasses caching when predict_unit is provided, for thread safety during concurrent execution (a sketch of how these pieces fit together follows this list)
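
To make the moving parts concrete, here is a minimal sketch of how these functions could fit together. The names map_partition_fairchembatch, map_partitioned_lists_fairchembatch, get_inference_batcher, and predict_unit come from this PR; the signatures and bodies below are illustrative assumptions, not the actual implementation.

# Illustrative sketch only -- signatures and bodies are assumptions
# based on the summary above, not the PR's actual code.
from concurrent.futures import ThreadPoolExecutor

from quacc import job
from quacc.recipes.mlp._base import get_inference_batcher


@job
def map_partition_fairchembatch(
    func, atoms_partition, fairchem_model, task_name, **calc_kwargs
):
    """Run func over one flat list of Atoms, sharing a cached batcher."""
    # Assumed signature: one cached InferenceBatcher per model configuration
    batcher = get_inference_batcher(fairchem_model, task_name)

    # Threads submit calculations concurrently; Ray Serve batches the GPU
    # requests behind the scenes. Passing predict_unit makes pick_calculator
    # skip its cache (thread safety, per item 4 above). Whether the batcher
    # itself serves as the predict unit is an assumption here.
    with ThreadPoolExecutor() as pool:
        futures = [
            pool.submit(func, atoms, predict_unit=batcher, **calc_kwargs)
            for atoms in atoms_partition
        ]
        return [f.result() for f in futures]


def map_partitioned_lists_fairchembatch(func, atoms_list, **kwargs):
    """Fan a list of lists out to one batched-partition job per partition."""
    return [
        map_partition_fairchembatch(func, partition_i, **kwargs)
        for partition_i in atoms_list
    ]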

Example usage:

from quacc.recipes.mlp.core import relax_job
from quacc.wflow_tools.job_patterns import (
    map_partitioned_lists_fairchembatch,
    partition,
    unpartition,
)

# A large list of Atoms objects (structures) to apply a job to
my_atoms_list = ...

# Partition the list into batches for distributed execution
partitioned_atoms = partition(my_atoms_list, num_partitions=10)

# Run batched FAIRChem relaxations on each partition
# Each partition runs on a separate worker with GPU batching
results_partitioned = map_partitioned_lists_fairchembatch(
    relax_job,
    atoms_list=partitioned_atoms,
    fairchem_model="uma-s-1",
    task_name="omat",
)

# Recombine results into a single list
all_results = unpartition(results_partitioned)

Tests:

  • Comprehensive test suite in test_fairchem_batch.py
  • Includes an end-to-end test with 10 copper structures (a sketch of such a test follows this list)
  • Includes a performance test verifying the batched-inference speedup over sequential execution
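
For orientation, the end-to-end case could look roughly like the following, reusing the public API from the example above. The copper setup and assertions are assumptions; see test_fairchem_batch.py for the actual tests.

# Rough sketch of the end-to-end test; details are assumptions, not the real test
from ase.build import bulk

from quacc.recipes.mlp.core import relax_job
from quacc.wflow_tools.job_patterns import (
    map_partitioned_lists_fairchembatch,
    partition,
    unpartition,
)


def test_batched_copper_relaxations():
    # Ten slightly rattled 2x2x2 Cu cells, split into two partitions of five
    atoms_list = [bulk("Cu") * (2, 2, 2) for _ in range(10)]
    for atoms in atoms_list:
        atoms.rattle(stdev=0.05)

    partitioned = partition(atoms_list, num_partitions=2)
    results = unpartition(
        map_partitioned_lists_fairchembatch(
            relax_job,
            atoms_list=partitioned,
            fairchem_model="uma-s-1",
            task_name="omat",
        )
    )

    # One output schema per input structure
    assert len(results) == 10
    assert all(isinstance(r, dict) for r in results)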


@buildbot-princeton (Collaborator) commented:

Can one of the admins verify this patch?

@zulissimeta changed the title from "first commit" to "Add support for batched fairchem inference for many parallel jobs" on Jan 27, 2026
codecov bot commented Jan 27, 2026

Codecov Report

❌ Patch coverage is 24.35897% with 59 lines in your changes missing coverage. Please review.
✅ Project coverage is 96.28%. Comparing base (173dc82) to head (a6e285a).
⚠️ Report is 1 commit behind head on main.

Files with missing lines                  Patch %   Lines
src/quacc/recipes/mlp/_base.py            24.00%    38 Missing ⚠️
src/quacc/wflow_tools/job_patterns.py     25.00%    21 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #3121      +/-   ##
==========================================
- Coverage   97.69%   96.28%   -1.42%     
==========================================
  Files          97       97              
  Lines        4172     4248      +76     
==========================================
+ Hits         4076     4090      +14     
- Misses         96      158      +62     

@zulissimeta zulissimeta marked this pull request as draft January 30, 2026 17:19
